Lazy Classifiers Using P-trees
Authors
Abstract
Lazy classifiers store all of the training samples and do not build a classifier until a new sample needs to be classified. They differ from eager classifiers, such as decision tree induction, which build a general model (such as a decision tree) before receiving new samples. K-nearest neighbor (KNN) classification is a typical lazy classifier. Given a set of training data, a k-nearest neighbor classifier predicts the class value of an unknown tuple X by searching the training set for the k nearest neighbors of X and assigning to X the most common class among them. Lazy classifiers are faster than eager classifiers at training time, but slower at prediction time, since all computation is deferred until then. In this paper, we introduce approaches to the efficient construction of lazy classifiers using the Peano Count Tree (P-tree), a data structure that provides a lossless, compressed representation of the original data and records count information to facilitate efficient data mining. Using the P-tree structure, we introduce two classifiers: a P-tree based k-nearest neighbor classifier (PKNN) and the Podium Incremental Neighbor Evaluator (PINE). Performance analysis shows that our algorithms outperform classical KNN methods.
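To make the lazy-classification idea concrete, the following is a minimal k-nearest neighbor sketch in Python. It is independent of P-trees; the function name knn_predict, the choice of Euclidean distance, and the toy data are illustrative assumptions, not part of the paper.

```python
from collections import Counter
import math

def knn_predict(training_data, x, k=3):
    """Classify x by majority vote among its k nearest training samples.

    training_data: list of (feature_vector, class_label) pairs.
    x: feature vector to classify.
    """
    # A lazy classifier defers all work to prediction time:
    # scan the stored training samples and rank them by distance to x.
    neighbors = sorted(
        training_data,
        key=lambda pair: math.dist(pair[0], x)
    )[:k]
    # Assign the most common class among the k nearest neighbors.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Example: two classes in a 2-D feature space.
train = [((1.0, 1.0), 'a'), ((1.2, 0.8), 'a'),
         ((5.0, 5.0), 'b'), ((4.8, 5.2), 'b')]
print(knn_predict(train, (1.1, 0.9), k=3))  # -> 'a'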
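The abstract describes the P-tree only as a lossless, compressed representation that records count information. The sketch below is one plausible reading of such a structure, a quadrant-based count tree over a single bit plane, in which pure quadrants (all 0s or all 1s) are not subdivided; the class name PCountNode, the recursion scheme, and the example bit plane are assumptions made for illustration, not the authors' implementation.

```python
class PCountNode:
    """One node of a quadrant-based count tree, in the spirit of a Peano Count Tree.

    Each node records the number of 1-bits in its quadrant of a bit plane.
    Pure quadrants (all 0s or all 1s) are not expanded, which is what makes
    the representation compressed yet lossless.
    """
    def __init__(self, bits, r0, c0, size):
        self.count = sum(bits[r][c]
                         for r in range(r0, r0 + size)
                         for c in range(c0, c0 + size))
        self.children = []
        pure = self.count in (0, size * size)
        if size > 1 and not pure:
            half = size // 2
            for dr in (0, half):
                for dc in (0, half):
                    self.children.append(PCountNode(bits, r0 + dr, c0 + dc, half))

# Example: a 4x4 bit plane (one bit position of one attribute of a 4x4 spatial dataset).
plane = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
root = PCountNode(plane, 0, 0, 4)
print(root.count)                                 # 5 one-bits in the whole plane
print([child.count for child in root.children])   # per-quadrant counts: [4, 0, 0, 1]
```

In a P-tree based classifier, counts aggregated from such trees would stand in for the per-sample scan in the KNN sketch above; PINE, as its name suggests, evaluates neighbors incrementally with distance-based weighting rather than a hard cut-off at k, though the abstract does not spell out its voting scheme.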
Similar resources
Combining Classifiers in Multimodal Affect Detection
Affect detection, where users' mental states are automatically recognized from facial expressions, speech, physiology, and other modalities, requires accurate machine learning and classification techniques. This paper investigates how combined classifiers, and their base classifiers, can be used in affect detection using features from facial video and multichannel physiology. The base classifiers...
Semi-Lazy Learning: Combining Clustering and Classifiers to Build More Accurate Models
Eager learners such as neural networks, decision trees, and naïve Bayes classifiers construct a single model from the training data before observing any test set instances. In contrast, lazy learners such as k-nearest neighbor consider a test set instance before they generalize beyond the training data. This allows making predictions from only a specific selection of instances most similar to th...
k-nearest Neighbor Classification on Spatial Data Streams Using P-trees
Classification of spatial data has become important due to the fact that there are huge volumes of spatial data now available holding a wealth of valuable information. In this paper we consider the classification of spatial data streams, where the training dataset changes often. New training data arrive continuously and are added to the training set. For these types of data streams, building a ...
Pruning Techniques in Associative Classification: Survey and Comparison
Association rule discovery and classification are common data mining tasks. Integrating association rules and classification, also known as associative classification, is a promising approach that derives classifiers whose accuracy is highly competitive with that of traditional classification approaches such as rule induction and decision trees. However, the size of the classifiers generated ...
UTD-HLT-CG: Semantic Architecture for Metonymy Resolution and Classification of Nominal Relations
In this paper we present a semantic architecture that was employed for processing two different SemEval 2007 tasks: Task 4 (Classification of Semantic Relations between Nominals) and Task 8 (Metonymy Resolution). The architecture uses multiple forms of syntactic, lexical, and semantic information to inform a classification-based approach that generates a different model for each machine learnin...
Journal:
Volume/Issue:
Pages: -
Publication date: 2002